ABSTRACT


The aim of this research, which was performed as a Lincoln Laboratory Innovative
Research Program (IRP) project, was to apply advanced digital speech and
signal-processing techniques toward improving cochlear implant electrode stimulators.
By providing a flexible stimulator whose function could be tuned depending
on the subject’s residual auditory nerves and the efficiency of the implant’s coupling
to those nerves, it was hypothesized that the subject’s speech reception could be
improved. The approach to providing these new and improved electrode stimulators
included the design of a laboratory signal processor used for interactive testing of
new algorithms with implant subjects. This Programmable Interactive System for
Cochlear Implant Electrode Stimulation (PISCES) was designed, built, and tested
at Lincoln Laboratory and then delivered to the Massachusetts Eye and Ear Infirmary
(MEEI) Cochlear Implant Research Laboratory (CIRL). In collaboration with
researchers at MEEI CIRL and MIT Research Laboratory of Electronics (RLE),
new algorithms run on PISCES have resulted in substantial improvements in subject
speech reception relative to that with their current implant stimulators. These
results were obtained through interactive algorithm adjustment at the clinic,
which demonstrated the importance of a flexible signal processor.
1. INTRODUCTION


Over 300,000 people in the United States suffer from a profound hearing loss. In these cases,
treatment via conventional hearing aids is ineffective. In most of these cases, an array of electrodes
can be surgically implanted to excite surviving inner ear auditory neurons. These electrodes are
stimulated by devices that transduce acoustic waves (speech, music, noise, etc.) to electric signals.
The signal processing performed by the electrode stimulators is more complex than the
frequency-dependent amplification performed by a conventional hearing aid.

The aim of this Innovative Research Program (IRP) project was to apply Lincoln Laboratory
digital speech and signal-processing expertise toward improving cochlear implant electrode
stimulators.1 By providing a flexible stimulator whose function could be tuned depending on the
subject’s residual nerves and the efficiency of the implant’s coupling to those nerves, it was
hypothesized that speech reception could be improved. The approach to providing these new and
improved electrode stimulators included the design of a laboratory signal processor used for
interactive testing of new algorithms with implant subjects. This Programmable Interactive System for
Cochlear Implant Electrode Stimulation (PISCES) was designed, built, and tested at Lincoln
Laboratory and then delivered to the Massachusetts Eye and Ear Infirmary (MEEI) Cochlear Implant
Research Laboratory (CIRL). In collaboration with researchers at MEEI CIRL and MIT Research
Laboratory of Electronics (RLE), new algorithms were designed and run on PISCES and have
resulted in speech reception improvements for implant subjects relative to their current implant
stimulators. These improvements were obtained as a result of interactive algorithm adjustment at
the clinic, which demonstrated the importance of a flexible signal processor.

This  report  summarizes  the  design,  implementation  and  testing  activities  of  the  IRP  project. 
Chapter  2  describes  the  cochlear  implant  and  the  conditions  that  make  it  necessary.  Chapter  3 
outlines  the  PISCES  hardware  and  software  design.  Chapter  4  details  the  algorithms  that  have 
been  implemented  on  PISCES  and  used  in  clinical  interactions.  Chapter  5  reports  the  results  of 
testing  with  six  implant  subjects.  Finally,  Chapter  6  discusses  the  consequences  of  this  IRP  effort 
and  the  subsequent  follow-on  work. 


1 In this report, the phrases “electrode stimulator” and “implant stimulator” refer to all of the
processing that converts an acoustic signal to a current source output used to drive an implant
electrode. The term “processor” has been avoided as it can denote a laboratory-based computer, a
microcomputer (DSP chip) within the laboratory computer, a portable analog or digital
acoustic-to-current transducer, or an algorithm running in a digital signal processor (either
laboratory based or portable).
2. THE COCHLEAR IMPLANT FOR SENSORY/NEURAL DEAFNESS


In the healthy human peripheral auditory system, sound perception begins when an incident
acoustic wave causes the ear drum to vibrate [2]. As shown in Figure 1, this vibration is coupled
through three small bones in the middle ear to the cochlea of the inner ear. The cochlea is a
helical structure that is surrounded by bone and filled with fluid. The basilar membrane extends
along the cochlea for about 35 mm in the same helical shape. Sound vibrations that are conducted
through the middle ear to the cochlea cause the basilar membrane to vibrate. The end of the
basilar membrane near the coupled input at the oval window responds to high frequencies, while
the end farthest from the input responds to the lowest frequencies. Along the basilar membrane is
the organ of Corti, an elaborate structure containing thousands of hair cells. The bending motion
me causes small cilia on these hair cells to release neurotransmitters which
y neurons. Although the auditory neurons transmit a rich variety of signals
t basic parameter is simply the place at which the neuron originates on the
which neurons are active is one cue the brain uses to map the frequency
;nals. One measure of signal intensity is the total neuronal activity. Shown in
Is of the cochlea in cross section. The tympanic canal is the passage to which
the round window allows access, while the vestibular canal is terminated by the oval window driven
from the stapes. These two spaces are separated by a third space called the medial or cochlear
canal. The frequency selective basilar membrane forms one wall of this medial space which also
contains the organ of Corti as well as the auditory neurons. The 30,000 auditory neurons provide
all the information used by the brain to perceive the acoustic environment.


Figure 1. Block diagram of the peripheral auditory system.


Figure 2. The cochlear implant in place.

One  common  form  of  deafness  is  associated  with  the  gradual  deterioration  of  the  middle  ear 
coupling  bones  [14].  Because  this  deterioration  results  in  attenuation  in  the  path  from  acoustic 
input to the basilar membrane, this condition is sometimes treated with conventional (i.e., linear
filtering) hearing aids or surgery. Another source of severe hearing impairment involves the loss of
sensory  hair  cells  and/or  the  sensory  neurons  connected  to  them.  This  form  of  hearing  loss  may  be 
caused  by  extremely  loud  sounds  that  damage  the  sensitive  hair  cell  cilia,  by  the  effect  of  certain 
drug  treatments  on  the  cilia,  by  disease  which  may  destroy  the  hair  cells  and  the  neurons,  or  by 
the  lack  of  transducer  mechanisms  due  to  a  congenital  condition.  About  300,000  cases  of  sensory 
hearing  impairment  exist  in  the  United  States  alone.  The  left-hand  panel  of  Figure  3  shows  a 
healthy  array  of  hair  cells  and  neurons.  Hair  cells  are  shown  as  loops  on  the  basilar  membrane, 
and  neurons  are  shown  connecting  hair  cells  to  the  auditory  nerve.  The  auditory  nerve  connects  to 
the  brain  stem.  In  contrast,  the  right-hand  panel  of  Figure  3  shows  degeneration  associated  with 
sensory/neural  deafness. 


Figure 3. Patterns of nerve and hair cell survival for a normal and deaf ear.
Cochlear implants were developed to substitute for damaged hair cells and other elements in
the  cochlea  and  to  bypass  the  external  and  middle  ear  pathways  [4].  The  implant  is  an  electrode 
array  that  provides  electric  fields  in  close  proximity  to  the  remaining  nerve  fibers  when  excited 
by  a  stimulator.  The  electrode  stimulator  is  a  signal-processing  device  that  converts  incoming 
acoustic  signals  into  stimuli  appropriate  for  the  implant  electrodes.  A  survey  of  cochlear  implants 
and  stimulators  was  undertaken  by  Ifukube  [5]. 

The  insertion  of  a  cochlear  implant  is  a  surgical  procedure  that  varies  with  the  type  of  electrode 
array  to  be  used  as  well  as  the  place  at  which  it  is  meant  to  establish  electrical  currents  and  fields. 
Early implants comprised a single electrode in the middle ear close to the cochlea, while modern
procedures  snake  an  electrode  assembly  into  the  cochlear  cavity  to  interact  with  a  wide  range  of 
remaining  sensory  neurons  [7].  Figure  2  shows  such  an  electrode  assembly  inserted  through  the 
membrane  of  the  round  window.  The  electrode  assembly  comprises  multiple  insulated  wires.  Each 
wire  is  connected  at  one  end  to  a  conducting  contact  placed  near  the  basilar  membrane.  These 
contacts  are  distributed  along  the  length  of  the  basilar  membrane.  The  other  end  of  each  insulated 
wire  connects  eventually  to  the  electrode  stimulator.  The  electrodes  can  be  excited  as  balanced 
pairs  or  as  monopolar  electrodes  with  a  common  return. 

The human subjects tested as part of this project have had Ineraid [3] electrode arrays surgically
implanted. These implants consist of six electrodes distributed along the first 20 to 24 mm
of the cochlea and two extracochlear electrodes that can be used as ground returns. Each subject
wears on his belt an Ineraid four-channel stimulator designed to excite four of the implant
electrodes. During visits to CIRL, the Ineraid stimulator was replaced by PISCES, the programmable,
interactive  system  for  cochlear  implant  electrode  stimulation  described  in  Chapter  3. 
3. THE PISCES SYSTEM


This  chapter  describes  the  PISCES  hardware  and  software  used  to  convert  acoustic  speech 
input  into  signals  suitable  for  stimulating  cochlear  implant  electrodes.  To  test  new  implant  electrode 
stimulation  strategies,  PISCES  was  used  in  place  of  the  Ineraid  hardware  stimulator  in  laboratory 
experiments. 

3.1  Hardware  Configuration 

A block diagram of the PISCES hardware is shown in Figure 4 and a photograph is shown in
Figure 5. The following sections outline the major PISCES hardware modules.


Figure 4. Block diagram of the PISCES hardware.

3.1.1  Host  Computer 

The host computer fulfills many objectives. First, it functions as a general-purpose computer,
performing non-real-time floating-point electrode stimulation simulations. Next, it gives the user
the ability to review graphically the effects of a given stimulation algorithm both for debugging
purposes and for comparison against other algorithms. Additionally, when the host computer
controls a special-purpose DSP board, its tasks include downloading the stimulation algorithms
and parameters to the DSP board; initializing and interrupting board processing; and passing data
among the user, the DSP board, and the disk file system.


Figure 5. Photograph of the PISCES hardware.

After considering both UNIX workstations and IBM-PC/AT compatibles, a Sun Microsystems
SPARCstation IPC was chosen to serve as the PISCES host. The IPC is an inexpensive,
4.2-MFLOPS UNIX workstation incorporating a 25-MHz SPARC integer and floating-point processor.
It has spare S-Bus slots for installation of peripheral hardware. With regard to software, the
IPC comes loaded with SunOS (Sun’s version of UNIX) and OpenWindows (Sun’s version of
X-windows), thereby affording a multiuser, multitasking environment not commonly available on
IBM-PC/AT compatibles. Although DSP processing boards are much more widely available for the
IBM-PC/AT platform, the Sun/UNIX expertise accumulated by the Lincoln and MEEI personnel
prior to this project heavily influenced the selection of the IPC.

3.1.2  DSP  Board 

Several  DSP  boards  were  commercially  available  for  incorporation  into  the  IPC.  Each  board 
contains  a  single  DSP  chip,  fast  memory,  serial  interfaces,  and  an  S-Bus  interface  to  the  IPC. 
Boards  containing  the  Motorola  56000,  AT&T  DSP32C,  and  Texas  Instruments  TMS320C30  were 
considered. Boards using the 56000 were disqualified early on, as the 56000 is a fixed-point
processor. Given the availability of fast, floating-point processors, there was a desire to bypass the
complications of fixed-point arithmetic. The DSP32C was considered but disqualified due to the
explicit pipelined nature of its assembly code. Although it was anticipated that much of the
software would be written in C, it seemed inevitable that some assembly coding would be required,
and  it  was  the  experience  of  the  Lincoln  personnel  that  the  DSP32C  is  not  easily  programmed  in 
assembly  language.  The  TMS320C30,  a  floating-point  chip  that  is  easily  programmed  in  assembly 
language,  had  been  employed  successfully  for  other  projects  at  Lincoln;  thus,  it  was  chosen  as  the 
PISCES  DSP  chip. 

At  the  onset  of  the  IRP  project,  only  one  DSP  board  vendor  manufactured  a  TMS320C30- 
based  board  compatible  with  the  IPC.  Sonitech  Incorporated’s  Spirit-30  S-Bus  card  comprises  a 
33-MFLOPS  TMS320C30,  an  S-Bus  interface,  2  Mbytes  of  zero  wait  state  RAM,  and  two  serial 
ports.2  Furthermore,  the  Spirit-30  supports  the  SPOX  operating  system  (see  Section  3.2  below), 
which  eases  the  software  migration  from  non-real-time  workstation  simulations  to  real-time  DSP 
board  implementations.  Figure  6  shows  a  block  diagram  of  the  Spirit-30  card. 


2 Subsequently, Loughborough Sound Images has introduced an S-Bus-based C30 card quite similar
to the Spirit-30.


Figure 6. Block diagram of the Sonitech Spirit-30 card.


3.1.3  Analog  Interface 

To  provide  an  A/D  and  D/A  capability  for  the  Texas  Instruments  TMS320C30  DSP  chip 
used  by  the  Sonitech  board,  the  Flexible  Lincoln  Audio  Interface  (FLAIR)  board  was  designed  to 
connect to the serial I/O port of the TMS320C30 chip. The FLAIR board provides a two-channel
A/D  input  stream  using  the  Crystal  CS5336  16-bit  delta-sigma  modulation  converter.  As  the  chip 
performs  oversampling  followed  by  digital  filtering  and  downsampling,  it  exhibits  a  high  signal-to- 
noise  ratio  over  a  wide  range  of  sampling  rates.  This  approach  eliminates  the  need  for  external 
antialiasing  filters. 

The D/A output from the FLAIR board is provided by Burr Brown PCM56P 16-bit converters
that have a settling time of 1.5 µs. Between the C30 chip and the D/A converters, Nippon Precision
Circuits  SM5813AP  upsampling  FIR  filter  chips  can  be  optionally  engaged.  Each  SM5813AP 
provides  two  channels  of  a  1:8  upsampling  and  associated  digital  low-pass  filtering.  Upsampling 
and  filtering  prior  to  D/A  conversion  eliminates  the  need  for  a  sharp  analog  smoothing  filter.  The 
converters  can  also  be  driven  directly  from  the  C30,  bypassing  the  upsampling  and  filtering  stages, 
to  generate  pulse  signals  at  the  sampling  interval  width. 

The  FLAIR  board  assembly  that  is  used  in  PISCES  provides  two  channels  of  16-bit  A/D  input 
and  eight  channels  of  16-bit  D/A  output.  Sampling  rate,  use  of  upsampling  or  direct  outputs,  and 
the number of A/D and D/A channels are specified under software control. The hardware assembly
can  be  easily  extended  to  provide  additional  D/A  outputs. 

3.2  Software  Environment 

Development  of  implant  stimulation  software  on  PISCES  generally  takes  advantage  of  the 
three  processing  modes  shown  in  Figure  7  and  described  below.  Detailed  descriptions  of  the  actual 
stimulation  algorithms  are  postponed  until  Chapter  4. 


Figure 7. Block diagram of the three operating modes. In host file-to-file mode and C30
file-to-file mode, a single-channel input file is read from the host file system and multichannel
output files are written to the host file system. In real-time mode, the C30 DSP chip processes a
single-channel digital stream from the A/D and produces a multichannel digital stream to the D/A.
3.2.1 Step 1: Host File-to-File Mode

The purpose of the host file-to-file mode is to allow the researcher to explore a range of
signal-processing algorithms to be used for electrode stimulation. By processing test files (such as sums
of sine waves, noise, chirp signals, and speech), the algorithms can be debugged and evaluated
under tightly controlled conditions. All programs are written in C using the standard math and file
I/O libraries. Programs are debugged using high-level debugging tools such as Sun’s dbxtool, an
OpenWindows-based, symbolic source code debugger. Finally, the resulting multichannel output
files are displayed using interactive waveform and spectrogram display packages such as Entropic
Research Laboratory’s waves+. The final result of step 1 is a debugged C program that takes
single-channel speech files as input and produces multichannel electrode stimulation files as output.
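
As a rough illustration of the kind of controlled test input mentioned above, the following Python sketch generates a sum of sine waves and a linear chirp. It is not taken from the PISCES software, which was written in C; the function names, the equal-weight scaling, and the example frequencies are assumptions chosen only for illustration.

```python
import math


def sum_of_sines(freqs_hz, sample_rate_hz, n_samples, amplitude=1.0):
    """Equal-weight sum of sine waves, scaled so the peak never exceeds amplitude."""
    scale = amplitude / len(freqs_hz)
    return [
        scale * sum(math.sin(2.0 * math.pi * f * n / sample_rate_hz) for f in freqs_hz)
        for n in range(n_samples)
    ]


def linear_chirp(f_start_hz, f_end_hz, sample_rate_hz, n_samples):
    """Sine wave whose instantaneous frequency sweeps linearly from f_start to f_end."""
    duration_s = n_samples / sample_rate_hz
    sweep_rate = (f_end_hz - f_start_hz) / duration_s  # Hz per second
    signal = []
    for n in range(n_samples):
        t = n / sample_rate_hz
        phase = 2.0 * math.pi * (f_start_hz * t + 0.5 * sweep_rate * t * t)
        signal.append(math.sin(phase))
    return signal


# Hypothetical test inputs at the 10-kHz default sampling rate listed in Table 1.
two_tone = sum_of_sines([500.0, 1500.0], 10000, 100)
sweep = linear_chirp(100.0, 4000.0, 10000, 200)
```

Signals of this kind have a known spectrum, so the output of each filter-bank channel can be checked by inspection before speech is used.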

3.2.2 Step 2: C30 File-to-File Mode

The  next  step  in  the  software  development  process  is  porting  the  C  code  from  the  Sun  host  to 
the Spirit-30 board. An advantage of using the TMS320C30-based Spirit-30 board is the availability
of  Spectron  Microsystems’  SPOX  operating  system.  Although  TI  provides  an  ANSI  C  compiler  for 
converting  C  code  to  TMS320C30  assembly  code,  SPOX  eases  the  porting  process  by  allowing  the 
user  to  retain  I/O  with  the  host  operating  system  through  the  use  of  commonly  used  file  I/O  routines 
such as fprintf, fwrite, fscanf, and fread. In this manner, the file-to-file C program written
for  the  Sun  can  be  recompiled  and  run  on  the  C30  with  very  few  changes.3  All  interaction  between 
the  host  and  the  TMS320C30  board  required  to  effect  transfer  of  data  from  disk  to  TMS320C30  is 
performed  (from  the  user’s  point  of  view)  invisibly.  Using  SPOX  and  comparing  the  output  of  the 
host  file-to-file  system  with  the  C30  file-to-file  system,  the  user  can  identify  quickly  any  compiler 
or  floating-point  inconsistencies  between  the  Sun  and  TI  CPUs.  Additionally,  C30  file-to-file  mode 
eases  the  transition  to  real  time  by  allowing  the  user  to  convert  from  C  to  C30  assembly  language 
those  subroutines  containing  critical  loops  that  are  identified  using  the  on-board  C30  timer.  Both 
the  correctness  and  the  effective  speed-up  of  these  conversions  to  assembly  language  can  be  verified 
using test files as input.
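
The comparison of host and C30 file-to-file outputs described above can be sketched as follows. This is a minimal Python illustration, not the procedure actually used on PISCES: the function names and the default tolerance are assumptions, and each channel is represented simply as a list of samples already read from an output file.

```python
def max_abs_difference(host_channels, dsp_channels):
    """Largest sample-by-sample difference between two multichannel outputs."""
    if len(host_channels) != len(dsp_channels):
        raise ValueError("outputs have different channel counts")
    worst = 0.0
    for host, dsp in zip(host_channels, dsp_channels):
        if len(host) != len(dsp):
            raise ValueError("a channel has different lengths in the two outputs")
        for a, b in zip(host, dsp):
            worst = max(worst, abs(a - b))
    return worst


def outputs_agree(host_channels, dsp_channels, tolerance=1e-4):
    """True when the two outputs match to within an assumed tolerance."""
    return max_abs_difference(host_channels, dsp_channels) <= tolerance


# Hypothetical two-channel outputs from the two file-to-file modes.
host_out = [[0.0, 0.5], [1.0, -1.0]]
dsp_out = [[0.0, 0.50001], [1.0, -1.0]]
```

A small nonzero difference would point to floating-point rounding, while a large one would suggest a compiler or porting error; the threshold separating the two is a judgment call.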

3.2.3  Step  3:  Real-Time  Mode 

For the real-time mode, the C and assembly code tested in the C30 file-to-file mode are
extended to support input from and output to the FLAIR board in addition to the host file system.
Prior to entering this mode, the user has code in hand that has been verified correct in the C30
file-to-file mode; thus, debugging attention can be focused on porting to the analog interface.
C-callable assembly language subroutines are available to the programmer for easy setup of the analog
interface parameters (sampling rate, number of channels, and anti-imaging filter specification),
definition of input and output memory buffers, and initiation and termination of analog conversion.


3 However, careful attention to memory management is required. Programmers are advised to use
SPOX memory allocation routines, which differ slightly from traditional C.
4. ALGORITHMS


4.1  Digital  Simulation  of  the  Ineraid  Stimulator 

The first algorithm implemented on PISCES was a digital simulation of the analog processing
performed by the Ineraid stimulator as shown in Figure 8. The signal is first fed to an automatic
gain control (AGC) which applies dynamic range compression (DRC). The output of the AGC/DRC
is sent to a four-channel filter bank. In the analog hardware, each resulting continuous waveform
output from the filter bank drives a voltage-to-current converter whose gain is adjusted to reflect the
threshold measured for the subject’s corresponding electrode. In the digital implementation, the
outputs of the filter bank are sent to a D/A converter after which output gain and voltage-to-current
conversion are applied by external hardware. In both cases, the outputs of the voltage-to-current
converters are used to stimulate the Ineraid implant electrodes.


Figure 8. Signal processing performed by the Ineraid hardware stimulator and the digital
simulation.

The  main  advantage  of  the  digital  simulation  over  the  analog  implementation  is  the  degree  of 
flexibility  afforded  to  the  clinical  staff  in  fitting  a  stimulator  to  an  individual  subject.  In  the  analog 
implementation  only  gain  parameters  may  be  adjusted.  In  the  digital  simulation,  essentially  all  of 
the  AGC/DRC  and  filter  bank  characteristics  may  be  specified  at  run  time.  The  following  sections 
summarize  the  digital  simulation  software,  demonstrate  its  flexibility,  and  show  example  input  and 
output  signals  for  a  typical  configuration. 
4.1.1 The cbank Program

cbank  is  a  C  language  digital  simulation  of  the  Ineraid  stimulator.  The  program  has  been 
compiled  and  run  successfully  on  both  a  Sun  SPARCstation  IPC  and  the  Sonitech  TMS320C30 
board.  For  the  C30  board,  a  few  key  subroutines  were  optimized  in  assembly  language  to  achieve 
real-time  performance. 

cbank  was  written  and  tested  using  the  three-step  procedure  outlined  in  Section  3.2.  When 
running  on  the  Sun,  cbank  reads  a  single-channel  sampled  data  file  from  the  host  file  system  and 
produces  a  multichannel  file  as  output.  When  running  on  the  C30  board,  the  user  may  instruct 
cbank  to  either  read  and  write  files  from/to  the  host  file  system  through  SPOX  or  read  and  write 
from/to  the  A/D  and  D/A  converters. 

Considerable  flexibility  is  available  to  the  user  through  the  use  of  command  line  arguments 
and specification files. These arguments and files allow the user to configure a wide range of cbank
parameters  at  run  time,  rather  than  at  compile  time,  thereby  enhancing  the  user’s  ability  to  work 
with  a  subject  interactively.  Parameters  that  can  be  varied  include  sampling  rate,  number  of 
filters  in  the  filter  bank,  filter  shapes,  AGC/DRC  characteristics,  I/O  type  (disk  versus  A/D  and 
D/A), etc. Table 1 and Figures 9 and 10 show the complete list of command line arguments, an
example  main  parameter  specification  file,  and  an  example  DRC  file,  respectively.  The  parameter 
specification  file  of  Figure  9  identifies  four  files  each  containing  coefficients  for  an  FIR  filter.  These 
four  files  are  read  during  program  initialization,  and  the  corresponding  coefficients  are  stored  in 
memory.  In  addition,  the  use  of  AGC,  as  well  as  the  AGC  attack  and  decay  time  constants,  is 
specified. 
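
To make the run-time configuration concrete, here is a hedged Python sketch of a command line that mirrors a subset of the flags and defaults listed in Table 1. cbank itself is a C program and its argument handling is not shown in this report; the parser name, the destination names, and the treatment of the two file names as positional arguments are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Illustrative parser for a subset of the cbank flags listed in Table 1."""
    parser = argparse.ArgumentParser(prog="cbank_sketch")
    # Assumption: the input and output files are positional (Table 1 shows no flag).
    parser.add_argument("input_file", help="input speech file")
    parser.add_argument("output_file", help="output stimulation file")
    parser.add_argument("-bs", dest="buffer_size", type=int, default=50,
                        help="input buffer size in samples")
    parser.add_argument("-mxf", dest="max_filters", type=int, default=20,
                        help="maximum number of filters")
    parser.add_argument("-sr", dest="sample_rate", type=int, default=10000,
                        help="sampling rate in Hz")
    parser.add_argument("-sf", dest="spec_file", default="specfile",
                        help="main parameter specification file")
    parser.add_argument("-df", dest="drc_file", default="testdrc",
                        help="DRC specification file")
    parser.add_argument("-tc", dest="timing_check", action="store_true",
                        help="enable timing check")
    return parser


# Hypothetical invocation overriding only the sampling rate and enabling timing.
args = build_parser().parse_args(["in.sd", "out.sd", "-sr", "8000", "-tc"])
```

Because every parameter has a default, a clinician can change one setting between trials without recompiling, which is the interactive benefit the text describes.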


4.1.2  Automatic  Gain  Control/Dynamic  Range  Compression 

In  the  digital  simulation,  the  AGC/DRC  is  implemented  as  follows: 

•  For  each  input  sample,  determine  whether  the  AGC  is  in  attack  mode  or  release 
mode. 

•  Depending  on  the  mode,  calculate  a  new  estimate  of  the  signal  envelope. 

•  Given the signal envelope estimate, calculate an appropriate gain and apply this gain
to the sample.

Defining $x[n]$ as the output of the A/D converter at time $n$, the attack mode envelope
is defined as

$$y_A[n] = \alpha_A |x[n]| + \beta_A y[n-1] . \tag{1}$$

The corresponding release mode envelope is defined as

$$y_R[n] = \alpha_R |x[n]| + \beta_R y[n-1] . \tag{2}$$


TABLE 1

The cbank Command Line Arguments

Flag     Value Type   Description                                                Default
         String       Input speech file                                          (No default)
         String       Output stimulation file                                    (No default)
-bs      Integer      Input buffer size (samples)                                50
-mxf     Integer      Maximum number of filters                                  20
-sr      Integer      Sampling rate in Hz                                        10000
-sf      String       Main parameter specification file                          "specfile"
-df      String       DRC specification file                                     "testdrc"
-V       Integer      Diagnostic verbosity level                                 0
-tc      (None)       Enable timing check using C30 timer                        FALSE
-tr      Integer      Timer resolution in µs                                     100
-aio     (None)       Use analog interface I/O (AIO) instead of file system I/O  FALSE
-ib      Integer      AIO: number of input buffers                               2
-ob      Integer      AIO: number of output buffers                              2
-rtm     (None)       AIO: enable real-time modification of parameters           FALSE
-rtmup   Integer      AIO: real-time update time (seconds)                       3

If $|x[n]| > y[n-1]$, then the AGC is defined to be in attack mode, and $y[n]$ is set equal to $y_A[n]$. On
the other hand, if $|x[n]| < y[n-1]$, then the AGC is defined to be in release mode, and $y[n]$ is set
equal to $y_R[n]$. Once the mode is determined and $y[n]$ is calculated, $y[n]$ is used as an index to a
lookup table to determine $g[n]$, the gain for time $n$. The final output of the AGC/DRC, which is
used as the input to the filter bank, is

$$z[n] = x[n] \times g[n] . \tag{3}$$

The $\alpha$ and $\beta$ values are derived from the attack and release times, $t_A$ and $t_R$, set in the main
parameter specification file. Given an A/D sampling period $\tau$ in seconds, and $t_A$ and $t_R$ specified
by the user in milliseconds,

$$\beta_A = e^{-1000\tau / t_A} \tag{4}$$

$$\beta_R = e^{-1000\tau / t_R} \tag{5}$$
    VERSION-22
    BEGIN_FRONTEND
    # specify gain prior to AGC in dB
    pregain-0
    END_FRONTEND
    BEGIN_FILTERS
    # format: filter-file-name post-gain-to-be-applied-dB
    filters/0000.0800.dat -15.4
    filters/0700.1300.dat -9.4
    filters/1300.2400.dat -4.5
    filters/2300.4400.dat -0.0
    END_FILTERS
    BEGIN_AGC
    # values in milliseconds, first line either 'enabled' or 'disabled'
    enabled
    attack-0
    release-250
    END_AGC


Figure 9. A cbank main parameter specification file.


raw.  io 


-100 -100 -50 -50
-50 -50 -40 -10
-40 -10 0 -6


Figure 10. A cbank dynamic range compression file.
$$\alpha_A = 1 - \beta_A \tag{6}$$

$$\alpha_R = 1 - \beta_R \tag{7}$$

The DRC function g[ ] is provided indirectly by the user to cbank in a DRC specification file as a
piecewise  linear  function  of  desired  output  envelopes  e  as  a  function  of  estimated  input  envelopes 
y.  For  convenience,  the  user  specifies  this  function  by  providing  the  endpoints  (in  decibels)  of 
each  linear  component.  The  linear  gains  g  are  derived  by  the  linear  domain  division  of  e  by 
y  and  are  calculated  as  a  function  of  y  during  algorithm  initialization.  Three  piecewise  linear 
regions  have  been  specified  in  the  example  DRC  specification  file  of  Figure  10.  This  specification, 
shown  graphically  in  Figure  11,  is  linear  for  low  envelope  levels,  expansive  for  a  narrow  range  of 
intermediate  envelope  levels,  and  compressive  for  normal  envelope  values. 
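
A minimal Python sketch of how a gain might be derived from such a piecewise linear specification follows. It uses the three segments of Figure 10, read as (input start, output start, input end, output end) in decibels. Several points are assumptions rather than statements about cbank: envelope levels are converted with $20 \log_{10}$, 0 dB corresponds to an envelope of 1.0, levels outside the specified range are clamped, and the gain is computed directly here instead of being precomputed into a lookup table.

```python
import math

# Segments from Figure 10: (input_start_dB, output_start_dB, input_end_dB, output_end_dB).
EXAMPLE_SEGMENTS = [
    (-100.0, -100.0, -50.0, -50.0),  # linear region
    (-50.0, -50.0, -40.0, -10.0),    # expansive region
    (-40.0, -10.0, 0.0, -6.0),       # compressive region
]


def output_level_db(input_db, segments):
    """Desired output envelope (dB) for a given input envelope (dB)."""
    lowest = segments[0][0]
    highest = segments[-1][2]
    clamped = min(max(input_db, lowest), highest)  # assumption: clamp out-of-range levels
    for x0, y0, x1, y1 in segments:
        if x0 <= clamped <= x1:
            return y0 + (y1 - y0) * (clamped - x0) / (x1 - x0)
    raise ValueError("segments do not cover the requested level")


def drc_gain(envelope, segments=EXAMPLE_SEGMENTS, floor=1.0e-5):
    """Linear gain g = e / y, computed as a difference of decibel levels."""
    input_db = 20.0 * math.log10(max(envelope, floor))
    output_db = output_level_db(input_db, segments)
    return 10.0 ** ((output_db - input_db) / 20.0)
```

For example, an input envelope at -45 dB maps to an output envelope at -30 dB, a gain of 15 dB, while an envelope at 0 dB is attenuated by 6 dB.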




Figure  11.  Example  of  dynamic  range  compression  curve. 


4.1.3  Filter  Bank  Design 

Finite  impulse  response  (FIR)  filters  were  employed  exclusively  in  this  digital  simulation  of 
the  Ineraid  stimulator  for  three  reasons.  First,  FIR  filters  have  a  distortion-free,  linear  phase. 
Second,  most  signal-processing  chips  can  be  programmed  easily  to  perform  FIR  filtering  at  the  rate 
of one tap per cycle with very low overhead. Finally, when performing low-pass filtering followed by
downsampling,  FIR  filter  outputs  may  be  computed  at  the  more  efficient  downsampled  rate.  The 
Parks-McClellan  procedure  was  used  for  designing  sharp  transition,  rectangular  frequency  response, 
band-pass  filters  [9].  The  Kaiser  window  procedure  was  used  to  design  nonrectangular  filters  with 
responses  more  similar  to  those  obtained  with  analog  discrete  components  [6]. 
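
The third point, computing FIR outputs only at the downsampled rate, can be illustrated with a short
sketch. It is not taken from cbank; the function names and the three-tap filter used in the usage
note are hypothetical. Because an FIR output depends only on past inputs, the outputs that would be
discarded by the downsampler never need to be computed.

```python
def fir_decimate(x, h, m):
    """Filter x with FIR taps h, computing only every m-th output sample."""
    out = []
    for n in range(0, len(x), m):      # visit only the retained output instants
        acc = 0.0
        for k, tap in enumerate(h):
            if n - k >= 0:             # samples before the start are taken as zero
                acc += tap * x[n - k]
        out.append(acc)
    return out


def fir_full_then_decimate(x, h, m):
    """Reference version: compute every output, then keep one of each m."""
    full = []
    for n in range(len(x)):
        acc = 0.0
        for k, tap in enumerate(h):
            if n - k >= 0:
                acc += tap * x[n - k]
        full.append(acc)
    return full[::m]
```

For example, with h = [0.25, 0.5, 0.25] and m = 2, both functions return the same samples, but
fir_decimate performs roughly half of the multiply-accumulate operations.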

4.1.4  Waveform  Examples  from  cbank 

For  the  band-pass  filter  bank  shown  in  Figure  12  and  the  main  and  DRC  specification  files 
of  Figures  9  and  10,  Figure  13  shows  the  output  of  cbank  for  a  typical  speech  input.  Band-pass 
filter  outputs  such  as  these  were  used  to  drive  isolated  voltage/current  converters  that  provided  the 
stimulation  currents  for  four  Ineraid  implant  electrodes. 


Figure 12. Example of cbank band-pass filters (horizontal axis: frequency in kHz).


4.1.5  Six-Channel  Simulations 

Present Ineraid implant assemblies provide six signal electrodes and a common ground return.
Up to now, the Ineraid analog hardware stimulator has been capable of driving only four signal
electrodes. Because the cbank Ineraid simulation is capable of driving six channels, it is possible
to run experiments using the full capability of the Ineraid implant.


Figure 13. cbank input and output. Top figure shows waveform for input sentence:
"We finished the IRP." Middle figure shows four channels of corresponding cbank output.
Bottom figure shows a wideband spectrogram of the input.

4.2  Continuous  Interleaved  Sampling  Algorithm 

At least two aspects of the Ineraid stimulator design contribute to degraded performance.
First, when the stimulation currents from a group of band-pass filters are applied simultaneously,
the electric fields of the implanted electrodes overlap, which results in crosstalk. Second,
performance is degraded due to a lack of dynamic range control for each band-pass filter output.
Aside from the general effect upon the dynamic range of the entire input spectrum controlled by
the input AGC processing, there is no mechanism for fitting the dynamic range of the filter outputs
to the useful perceptual range, from threshold to maximum acceptable loudness, of each driven
electrode. Output gain adjustments for each channel only reflect the subject's detection threshold.

The Continuous Interleaved Sampling (CIS) stimulator shown in Figure 14 is an attempt
to overcome both shortcomings of the Ineraid stimulator [12]. Instead of stimulating electrodes
with continuous outputs from a bank of band-pass filters, the CIS stimulation outputs are pulse
trains that are amplitude modulated by the output envelopes from a bank of band-pass filters.
As shown in Figure 15, the pulse outputs are skewed in time so that electrodes do not receive
simultaneous stimulation. Rather than using the raw band-pass filter output envelope to modulate
the pulse waveforms, parallel compressors map the raw envelope values into the dynamic range of
the subject's individual electrodes. Thus, while the Ineraid and CIS stimulators both use a bank of
band-pass filters to generate electrode stimulations, the CIS stimulator transforms the continuous
filter outputs into envelope modulated, nonoverlapping pulses. At the present time, this CIS design
is implemented only on PISCES and does not yet exist as a wearable stimulator for general subject
use.


Figure 14. The continuous interleaved sampling stimulator.


4.2.1  The  pbank  Program 

pbank  is  a  C  language  program  that  implements  the  CIS  algorithm.  As  in  the  case  of  the 
cbank  program  described  in  Section  4.1.1,  the  program  has  been  compiled  and  run  successfully  on 
a  Sun  SPARCstation  IPC  as  well  as  on  the  Sonitech  TMS320C30  board.  A  few  key  subroutines 
were  assembly  language  optimized  to  achieve  real-time  performance. 

When  running  on  the  SPARCstation,  pbank  reads  a  single-channel,  sampled  data  file  from 
the  host  file  system  and  produces  a  single  multichannel  file  as  output.  When  running  on  the  C30 
boards,  the  user  may  instruct  pbank  to  either  read  and  write  files  from/to  the  host  file  system 
through  SPOX  or  read  and  write  from/to  the  A/D  and  D/A  converters. 


Command line arguments and specification files allow each of the CIS processing blocks to
be altered as required by clinical interactions and experiments. Although the AGC operation is
described by the same DRC file format as in the cbank program, a more complex main specification
file is used. The output compression curves are specified either as a series of line segments, as
in the DRC file, or as a table. Table 2 and Figure 16 show the pbank command line argument list and
an example of a main specification file, respectively. In the following sections each of the
processing blocks is described in some detail. Like cbank, pbank was designed to provide an
arbitrary number of stimulation channels, making it possible to test both four- and six-channel
implementations.

4.2.2  Input Filtering and AGC/DRC

The input processing blocks consist of two low-pass filters and the AGC. The first low-pass
filter permits downsampling of the basic system sampling rate. This rate is set fairly high so that
the D/A converter can produce a narrow output pulse (e.g., a sampling frequency of 32 kHz allows the
D/A converter output to produce a minimum pulse width of 31.25 µs). Because the input spectrum
of interest is only 0 to 8 kHz, more efficient signal processing can be employed by downsampling to
16 kHz.

AGC/DRC  is  exactly  the  same  process  described  for  the  cbank  program  and  operates  upon 
the  output  of  the  first  downsampling  filter  to  reduce  the  overall  input  dynamic  range  as  specified 
Figure 15. Pulse outputs from pbank.

TABLE 2

The pbank Command Line Arguments

Flag       Value Type  Description                                                 Default
---------  ----------  ----------------------------------------------------------  ------------
(none)     String      Input speech file                                           (No default)
(none)     String      Output stimulation file                                     (No default)
-bs        Integer     Input buffer size (samples)                                 96
-mxc       Integer     Maximum number of channels                                  20
-sr        Integer     Sampling rate in Hz                                         48000
-sf        String      Main parameter specification file                           "specfile"
-df        String      DRC specification file                                      "testdrc"
-V         Integer     Diagnostic verbosity level                                  0
-tc        (None)      Enable timing check using C30 timer                         FALSE
-tr        Integer     Timer resolution in µs                                      100
-hwave     (None)      Use half-wave rectification                                 FALSE
-nohilb    (None)      Use only one BPF per channel                                FALSE
-nocomp    (None)      No output compression                                       FALSE
-lcomp     (None)      Linear interp output compression                            FALSE
-tcomp     (None)      Table lookup output compression                             FALSE
-oclen     Integer     Number of pts (li: segs+1, tl: log2(tabentries)) in out     32
                       compression
-ocname    String      Output compression specification file                       "octable"
-aio       (None)      Use analog interface I/O (AIO) instead of file system I/O   FALSE
-ib        Integer     AIO: number of input buffers                                2
-ob        Integer     AIO: number of output buffers                               2
-rtm       (None)      AIO: enable real-time modification of parameters            FALSE
-rtmup     Integer     AIO: real-time update time (seconds)                        3
-hware     (None)      AIO: hardware control mode                                  FALSE
-noup      (None)      AIO: no upsampling                                          FALSE
-fename1   String      Single channel front-ended #1 output file                   (No default)
-agcname   String      Single channel AGC'ed output file                           (No default)
-fename2   String      Single channel front-ended #2 output file                   (No default)
-ename     String      Multichannel envelope output file                           (No default)

VERSION-1
BEGIN_FRONTEND
# format: filter-file-name post-gain-to-be-applied-dB downsampling-n:1
../more_atten/unityf14000.lpf.dat 0.0 1
../more_atten/2kf14000.lpf.dat 0.0 4
# all channels go thru fir #1. which go thru fir #2?
# legend: 1:thru, 0:skip
# 1-3 thru fir#2. 4-6 skip fir#2
1 1 1 0 0 0
END FRONTEND
BEGIN_AGC
# values in milliseconds. first line either "enabled" or "disabled"
disabled
attack=0
release=230
END AGC
BEGIN_ENVELOPE
# format: filter-file-name post-gain-to-be-applied-dB downsampling-n:1
# expecting pairs of filters, i.e. one hilbert xform pair per channel
c1f4000.cos.dat 0.0 1
c1f4000.sin.dat 0.0 1
c2f4000.cos.dat 0.0 1
c2f4000.sin.dat 0.0 1
c3f4000.cos.dat 0.0 1
c3f4000.sin.dat 0.0 1
c4f14000.cos.dat 0.0 4
c4f14000.sin.dat 0.0 4
c5f14000.cos.dat 0.0 4
c5f14000.sin.dat 0.0 4
c6f14000.cos.dat 0.0 2
c6f14000.sin.dat 0.0 2
END ENVELOPE
BEGIN_DELAY
# the delay that we want to add between the hilbert transform and the
# low-pass filter in each channel. This number is specified in samples;
# the actual time in seconds depends on the sampling rate at the output
# of the hilbert transform. Typically, at least one of the values is
# zero; otherwise, we'd be adding artificial delay.
0 3 7 17 35 34
END DELAY
BEGIN_SMOOTH
# format: filter-file-name post-gain-to-be-applied-dB downsampling-n:1
# if downsampling is negative, it means upsampling
# gains on output started at 12,12,4,0,0,0; now adjusted for peak output
../more_atten/400f4000.lpf.dat 12.0 3
../more_atten/400f4000.lpf.dat 12.0 3
../more_atten/400f4000.lpf.dat 12.0 3
../more_atten/400f4000.lpf.dat 12.0 3
../more_atten/400f4000.lpf.dat 12.0 3
../more_atten/400f4000.lpf.dat 12.0 4
END SMOOTH
BEGIN_PM
# pulse modulation pattern specified as floats.
# one column per channel. when we get to the bottom, jump to the top.
matant=12
0 0 0 0 0 -1
0 0 0 0 0 1
0 0 0 0 -1 0
0 0 0 0 1 0
0 0 0 -1 0 0
0 0 0 1 0 0
0 0 -1 0 0 0
0 0 1 0 0 0
0 -1 0 0 0 0
0 1 0 0 0 0
-1 0 0 0 0 0
1 0 0 0 0 0
END PM


Figure  16.  A  pbank  main  parameter  specification  file. 




by  the  attack  and  release  time  constants  and  the  DRC  file.  The  second  low-pass  filter  is  used 
to  reduce  the  computation  for  the  subset  of  channels  whose  highest  frequency  is  below  2  kHz  by 
allowing  a  second  downsampling  for  that  subset  to  4  kHz.  Figure  14  shows  three  upper  frequency 
channels  driven  from  the  first  downsampling  filter,  and  three  lower  frequency  channels  driven  from 
the  second  downsampling  filter. 

4.2.3  Band-Pass  Envelope  Estimation 


Band-pass envelope estimation is required to provide a modulating signal to the pulse train
outputs. Two envelope estimation methods were implemented. The rectification (or detection)
method comprises a band-pass filter followed by a strong nonlinearity such as a full-wave or
half-wave rectifier. The nonlinearity output is smoothed by a low-pass filter to eliminate spurious
harmonics. The quadrature estimation method calls for creating a second band-pass output which
is shifted 90° in phase from the original. These two signals are squared, summed, and square
rooted, producing an estimate of the envelope. The choice of estimation procedure is specified on
the command line. The rectification estimation method is used commonly in both analog and digital
systems. Unfortunately, full- or half-wave rectification generates a range of spurious harmonics that
alias in the sampled data domain. Appendix A shows that the quadrature estimation method is
somewhat more robust to harmonic distortion and, consequently, aliasing.
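
The two estimators can be compared on a synthetic tone with the sketch below. It is illustrative
only and is not the pbank code: the moving average stands in for the low-pass smoothing filter, the
scale factors pi/2 and pi are the usual corrections for the mean of a full- or half-wave rectified
sinusoid, and all function names are hypothetical.

```python
import math


def quadrature_envelope(in_phase, quadrature):
    """Envelope from a 90-degree pair: square, sum, and take the square root."""
    return [math.sqrt(i * i + q * q) for i, q in zip(in_phase, quadrature)]


def rectified_envelope(x, smooth_len, half_wave=False):
    """Envelope by rectification followed by a moving-average smoother.

    The moving average is a stand-in (assumption) for the low-pass filter.
    """
    rect = [max(v, 0.0) if half_wave else abs(v) for v in x]
    # Mean of a rectified unit sinusoid is 2/pi (full wave) or 1/pi (half wave).
    scale = math.pi if half_wave else math.pi / 2.0
    out = []
    for n in range(len(rect)):
        window = rect[max(0, n - smooth_len + 1): n + 1]
        out.append(scale * sum(window) / smooth_len)
    return out


def demo_tone(amplitude, period, n_samples):
    """Constant-amplitude tone and its 90-degree shifted copy."""
    w = 2.0 * math.pi / period
    cos_part = [amplitude * math.cos(w * n) for n in range(n_samples)]
    sin_part = [amplitude * math.sin(w * n) for n in range(n_samples)]
    return cos_part, sin_part
```

For a constant-amplitude tone the quadrature estimate equals the amplitude at every sample without
any smoothing, whereas the rectified estimate only approaches it after averaging over a carrier
period.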

The  filter  bank  divides  the  input  spectrum  into  six  channels  spaced  logarithmically  in  center 
frequency  and  bandwidth  over  the  range  from  300  to  7000  Hz.  Rectification  envelope  estimation 
requires  only  one  band-pass  filter  per  channel.  Either  of  the  techniques  described  for  the  cbank 
filter  bank,  namely,  Parks-McClellan  or  Kaiser  window  design,  provides  the  needed  flexibility. 
Quadrature  envelope  estimation  requires  one  quadrature  pair  of  band-pass  filters  per  channel.  Two 
approaches  for  designing  such  filter  pairs  have  been  studied.  In  the  first  approach,  a  prototype 
low-pass  filter  is  designed  using  the  Parks-McClellan  algorithm.  Two  band-pass  impulse  responses 
are  obtained  by  multiplying  the  low-pass  impulse  response  by  sampled  sine  and  cosine  functions 
whose  frequencies  are  at  the  desired  band-pass  filter  center  frequency.  Multiplication  by  sine  and 
cosine  guarantees  the  fixed  90°  phase  difference  [10].  A  second  design  technique  uses  the  eigenfilter 
method  developed  by  Nguyen  which  approximates  arbitrary  magnitude  and  phase  responses  in  a 
minimum  mean  square  error  sense  [8].  This  technique  was  used  to  generate  quadrature  filter  pairs 
with  analoglike  12  dB  per  octave  responses.  In  this  design  procedure,  there  are  small  differences 
in  the  magnitude  response  for  a  band-pass  pair  in  the  frequency  range  of  interest.  Presently,  only 
the  Parks-McClellan  frequency  shifted  filters  have  been  used  with  subjects. 
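
The first design approach, modulating a low-pass prototype by sampled cosine and sine functions, can
be sketched as follows. This is a hedged illustration rather than the authors' design code: a
Hann-windowed prototype is used here purely as a stand-in for a Parks-McClellan low-pass design, the
carrier is centered on the prototype's linear-phase delay (an assumption), and the function names are
hypothetical.

```python
import cmath
import math


def hann_lowpass(n_taps):
    """Stand-in low-pass prototype: a Hann window normalized to unit DC gain."""
    taps = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / (n_taps - 1)) for n in range(n_taps)]
    total = sum(taps)
    return [t / total for t in taps]


def modulate_prototype(lowpass, center_hz, fs_hz):
    """Return (cos_taps, sin_taps): a band-pass pair with a 90-degree phase offset."""
    mid = (len(lowpass) - 1) / 2.0          # center of the symmetric prototype
    w0 = 2.0 * math.pi * center_hz / fs_hz
    cos_taps = [h * math.cos(w0 * (n - mid)) for n, h in enumerate(lowpass)]
    sin_taps = [h * math.sin(w0 * (n - mid)) for n, h in enumerate(lowpass)]
    return cos_taps, sin_taps


def freq_response(taps, f_hz, fs_hz):
    """Complex frequency response of an FIR filter at a single frequency."""
    w = 2.0 * math.pi * f_hz / fs_hz
    return sum(h * cmath.exp(-1j * w * n) for n, h in enumerate(taps))
```

With a symmetric prototype, the cosine filter is symmetric and the sine filter is antisymmetric, so
their responses differ in phase by 90° at every frequency; their magnitudes agree to the extent that
the prototype's stopband suppresses the image at twice the center frequency.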


4.2.4  Low-Pass  Smoothing  Filters 

For the case of a full- or half-wave rectified band-pass filter output, the low-pass filtering
process eliminates the out-of-band harmonics that have been generated by the nonlinearity but
allows the basic envelope waveform to be passed. This function is not needed for the quadrature
derived envelope as discussed in Appendix A. As this smoothed envelope will be the only information
provided to the subject, there is a trade-off between wide bandwidths that allow for maximal
envelope variations and narrow bandwidths that reduce pitch harmonic ripple. Additionally, low-pass
filtering allows for down- or upsampling so that envelope samples are generated at the pulse
sampling rate. At present, the low-pass filter response resembles that of a low-order analog filter
with a cutoff frequency at about 400 Hz.

4.2.5  Output  Compression  Mapping 

The envelope waveform dynamic range is mapped into the measured dynamic range for the
corresponding electrode. The mapping curve is one of the variables to be determined from clinical
interaction. Typical transformations that map from x (the envelope estimate available from the
low-pass smoothing filter) to y (the modulation level) are given by

y = A + B(x - X)^p ,  (8)

where A, B, X, and p are dependent on subject threshold and dynamic range measurements and
the desired compression characteristic. This mapping is specified independently for each channel.
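
One way to choose the constants of Equation (8) is sketched below. The report does not state how
A, B, and X were derived, so the choices here are assumptions made for illustration: X is set to the
low cutoff of the envelope range, A to the electrode threshold, and B so that the top of the envelope
range maps to UCL. The function name make_compressor and the exponent value in the usage note are
hypothetical.

```python
def make_compressor(threshold, ucl, x_min, x_max, p):
    """Build y = A + B * (x - X) ** p for one channel.

    Assumed parameter choices (not taken from the report):
      X = x_min, so the low cutoff maps to threshold,
      A = threshold,
      B chosen so that x_max maps to ucl.
    """
    a = threshold
    big_x = x_min
    b = (ucl - threshold) / (x_max - x_min) ** p

    def compress(x):
        # Clamp the envelope estimate into the mapped input range.
        x = min(max(x, x_min), x_max)
        return a + b * (x - big_x) ** p

    return compress
```

As a usage example, make_compressor(32.0, 210.0, 0.001, 1.0, 0.3) maps a 60-dB envelope range onto the
32 to 210 µA range listed for electrode 1 of subject S04 in Table 3; an exponent below 1 gives a
compressive curve.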

4.2.6  Pulse  Modulation  Waveform  Output 

The final pulse output waveforms for each channel are computed at the full input sampling
rate. For example, a 32-kHz rate will allow a channel D/A to output a pulse of width 31.25 µs. The
specification file of Figure 16 defines the pulse sequencing as a matrix of channels versus activity
at each sampling time for one period of the output cycle. The matrix shown in Figure 16 at a
sampling rate of 32 kHz would produce the six-channel output of Figure 15, with a pulse repetition
period of 375 µs, pulse width of 31.25 µs, and biphasic pulse width of 62.5 µs. The biphasic pulse
shape allows for a zero mean output signal* while retaining a narrow pulse shape; as discussed
earlier, the skewed, nonoverlapping pulse waveforms eliminate field overlap between electrodes in
the cochlea. These modulated pulse trains are output to the voltage-to-current converters, which
in turn stimulate the corresponding electrodes.
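
The timing figures quoted above follow directly from the pulse matrix and the sampling rate, as the
sketch below shows. It is an illustration under stated assumptions and not the pbank code: the
pattern generator reproduces the row order of the Figure 16 matrix (channel 6 first, negative phase
before positive), and the function names are hypothetical.

```python
def interleaved_pattern(n_channels):
    """One period of the pulse matrix: rows are sample times, columns are channels."""
    rows = []
    for ch in reversed(range(n_channels)):   # last channel fires first, as in Figure 16
        for polarity in (-1.0, 1.0):         # negative phase, then positive phase
            row = [0.0] * n_channels
            row[ch] = polarity
            rows.append(row)
    return rows


def pulse_frame(pattern, levels):
    """Scale one period of the pattern by each channel's modulation level."""
    return [[p * lvl for p, lvl in zip(row, levels)] for row in pattern]


def timing_us(pattern, fs_hz):
    """Pulse timing implied by the pattern length and the output sampling rate."""
    sample_us = 1e6 / fs_hz
    return {
        "pulse_width_us": sample_us,
        "biphasic_width_us": 2.0 * sample_us,
        "period_us": len(pattern) * sample_us,
    }
```

For six channels at 32 kHz the pattern has 12 rows, which gives a 31.25-µs pulse, a 62.5-µs biphasic
pulse, and a 375-µs repetition period. Each column sums to zero, so every channel delivers a zero
mean output over one period.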


4.2.7  Waveform  Examples  from  pbank 

Figure  17  shows  the  output  of  pbank  for  a  typical  speech  input. 


* Each electrode is stimulated by a current that is proportional to the pbank waveform output. A
zero mean signal results in delivery of zero net charge by the electrode to the cochlea, thereby
causing minimal trauma in the surrounding cochlear tissue.

Figure 17. pbank input and output. Top figure shows a waveform for input sentence:
"Massachusetts Eye and Ear Infirmary." Next figure shows estimated envelopes for all six
channels. Third figure shows a close-up of the pulse output in the middle of an unvoiced
fricative. Bottom figure shows wideband spectrogram of input.


5.  CLINICAL  EXPERIENCES  AND  RESULTS 


The PISCES system was installed at the Massachusetts Eye and Ear Infirmary's (MEEI)
Cochlear Implant Research Laboratory (CIRL) in early July 1991. The output of each D/A channel
of the audio interface was connected to an isolated current stimulator whose gain is adjustable
independently and whose output level is monitored and limited to a preset maximum value. This
set of isolated current drivers and monitoring circuits is contained in a single equipment rack outside
of a small sound-insulated testing room. The current outputs are available on a cable/plug assembly
inside the testing room so that subjects can unplug their own stimulator and substitute the isolated
output currents driven by PISCES. In addition, a "panic button" accessible to the subject allows for
a rapid disconnect between the connector and the current drivers in an emergency. A photograph
of PISCES and the current isolator/limiter equipment rack is shown in Figure 18.




Figure  18.  Photograph  of  PISCES  and  current  isolator/limiter  equipment  at  MEEI 
CIRL. 

The first clinical interaction presented subject S04 with the outputs of PISCES running
cbank. Initially, subject S04 described the cbank sound processing as very different from his own
Ineraid hardware stimulator. However, after adjusting the four band-pass filters to approximate
more closely the analog hardware filters, the cbank stimulations were judged as being very similar
to the Ineraid hardware stimulations. This interaction assured us that PISCES and the current
isolator/limiter equipment were capable of replacing safely and adequately the Ineraid hardware
stimulator. All remaining clinical tests focused on CIS algorithms as implemented in pbank.

5.1  Psychophysical Measurements

To  specify  the  output  compressions  of  the  pbank  system,  the  dynamic  range  of  each  of  the 
implant  electrodes  must  be  determined.  Dynamic  range  is  defined  as  the  difference  between  the 
signal  intensity  at  threshold  (i.e.,  a  just  perceivable  signal)  and  the  signal  intensity  that  is  just 
uncomfortably  loud  (UCL).  This  represents  the  range  over  which  stimulating  signals  can  be  usefully 
applied  and  perceived.  The  range  is  a  measure  of  the  physiological  state  at  the  site  of  electrode 
action  and  is  a  function  of  the  induced  current’s  proximity  to  still  functioning  auditory  nerves.  Not 
only  do  dynamic  ranges  vary  among  electrodes  but  they  are  also  functions  of  the  pulse  widths  and 
frequencies  used  for  the  stimulus  signal.  As  a  consequence,  it  is  important  that  the  dynamic  range 
be  measured  for  the  pulse  parameters  of  the  stimulator  in  use. 
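
The report does not state the formula used to express dynamic range in decibels. The sketch below
assumes the usual amplitude convention, 20 log10 of the ratio of UCL current to threshold current,
which reproduces several Table 3 entries (for example, 32 and 210 µA give 16.3 dB). The function
name is hypothetical.

```python
import math


def dynamic_range_db(threshold_ua, ucl_ua):
    """Electrode dynamic range in dB, assuming 20*log10(UCL / threshold)."""
    if threshold_ua <= 0 or ucl_ua <= 0:
        raise ValueError("currents must be positive")
    return 20.0 * math.log10(ucl_ua / threshold_ua)
```

For instance, dynamic_range_db(73, 162) evaluates to about 6.9 dB, matching the Table 3 entry for
electrode 1 of subject S05.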

The dynamic range measurements are used to set the output gains and/or compression
characteristics of the stimulating currents. In the case of the Ineraid hardware stimulator, the
subject's threshold measurements are used to set the gains for each filter output. For pbank, the
dynamic range information defines the output range of the compression curve for each channel.
Figure 19 shows a typical output compression curve for one electrode. The peak value observed from
the envelope estimator is always mapped to UCL. The input dynamic range, defined as the distance
between the envelope outputs mapped to threshold and mapped to UCL, is adjusted by moving
the low cutoff value.

5.2  Speech  Materials 

When a subject is initially connected to PISCES running pbank, an informal interactive
conversation between the subject and the researcher is used to gauge gross performance. This
provides useful feedback about signal levels as well as crude comparisons between the present and
previous parameter settings. The subject also has the opportunity to acclimate to each new
stimulator variation, thereby providing at least a small amount of learning before more quantitative
testing takes place, and may identify crude bugs in the new stimulation system under test. During
this phase of testing, fine tuning of the isolator gains may be performed.

The main quantitative tests of speech reception were based on measures of consonant
identification. These tests make use of 24 consonants in a vowel-consonant-vowel (VCV) setting (e.g.,
"asha," "aba") spoken by a male talker and available on a laser videodisc from the University
of Iowa [11]. An IBM-PC/AT control program written specially for the videodisc database (and
provided by researchers at the Research Triangle Institute [RTI]) plays random sequences of VCV
utterances and tabulates the subjects' responses [13]. The researcher may choose to use subsets of
8 or 16 consonants or may choose to use the full 24 consonant set. In all cases, 5 groups of the 8, 16,
or 24 randomized consonant sets are presented. The use of this consonant test and test controller
program allowed direct comparison with RTI results.


Figure 19. A typical output compression curve. (Figure annotations: input range, e.g., 10-60 dB;
the low cutoff to the input range is adjusted empirically.)

5.3  Subject  Experiments 

Six  subjects  were  connected  to  PISCES  running  pbank  as  described  below. 

5.3.1  Experience  with  Subject  S04 

Subject S04 appears to have good nerve survival, as shown by the measured dynamic ranges
for the six monopolar electrodes of the Ineraid implant in Table 3. These measurements were made
for 200-Hz biphasic pulse trains with a 500-µs biphasic pulse width. Table 4 shows the results of
the 24 consonant VCV tests on S04. As a baseline, his score using his Ineraid hardware stimulator
averaged 79% correct over six separate tests. The interaction with S04 using PISCES and pbank,
which spanned over 40 h of testing, was aimed at substantially raising his speech reception score by
adjusting pbank parameters. The clinical interactions began by adjusting pbank's channel specific

TABLE 3

Dynamic Range Measurements for Subjects S04 and S05

                        Electrode Number
                        1      2      3      4      5      6
Subject S04
  Threshold (µA)        32     31     33     38     40     67
  UCL (µA)              210    235    285    280    --     300
  Dynamic Range (dB)    16.3   17.7   18.8   17.3   --     13.0
Subject S05
  Threshold (µA)        73     66     71     134    --     90
  UCL (µA)              162    177    202    250    --     163
  Dynamic Range (dB)    6.9    8.6    9.1    5.4    --     5.2
Dynamic Range Differences, S04-S05
  Difference            --     9.1    --     11.9   13.5   7.8

-- : entry illegible in the source scan.


TABLE 4

Subject S04 Performance on 24 VCV Test

Condition                    Variable               Scores
---------------------------  ---------------------  --------------
Ineraid Hardware             Baseline               79% (73-85)

6-Channel CIS                12 dB/oct. BPFs        87% (83-92)
FWR envelopes                Rectangular BPFs       96% (95-96)

6-Channel CIS                FWR envelopes          96% (95-96)
Rectangular BPFs             Quadrature envelopes   99% (98-99)

6-Channel CIS                Interleaved pulses     99% (98-99)
Rectangular BPFs             Coincident pulses      82% (one test)
Quadrature envelopes

CIS                          6 Channels             99% (98-99)
Rectangular BPFs             4 Channels             89% (one test)
Quadrature envelopes


output  compression  curves  to  match  the  measured  values  of  S04’s  dynamic  range.  As  these  dynamic 
ranges  are  dependent  upon  the  pulse  widths  and  pulse  rates  used  for  the  stimulating  signals,  the 
compression  curves  were  recomputed  whenever  these  parameters  were  changed.  Generally,  mapping 
a  60-dB  envelope  dynamic  range  to  a  15-  to  18-dB  electrode  dynamic  range  was  optimal. 

Next, two different filter bank designs were evaluated. The first filter bank shown in Figure 20
is an approximation to a bank of analog, second order, Butterworth filters. The second filter bank
shown in Figure 21 is a more rectangular, frequency selective design. For S04, the rectangular
filters were superior, and, using these filters, quadrature processing outperformed the full-wave
rectification.


Figure 20. Filter bank approximating Butterworth filters (horizontal axis: frequency in kHz).

Informal experiments varying the pulse repetition rates in the range from 1 to 2 kHz with
corresponding pulse widths of 31 and 62 µs for each subpulse did not show much difference. However,
nonoverlapping presentation of each channel's pulse stimulation resulted in superior performance
versus coincident presentation, suggesting that reducing interelectrode field interaction is beneficial.
Only a single test with coincident pulses was run as it was clear from the subject's comments that
these pulses caused a significant degradation and change in the effective loudness of the signals.


Figure 21. Filter bank with more rectangular filters.


Testing of a four-channel CIS stimulator produced scores 10 percentage points lower than the
six-channel CIS system, indicating that the extra two channels provide added benefit. A four-channel
CIS outperformed the Ineraid stimulator by approximately 10 percentage points, suggesting that the
interleaved stimulation and the output compression also contribute to the superior CIS performance.

The interaction with S04 in the context of PISCES running pbank demonstrated the basic
hypothesis of the IRP, namely, that it would be possible to improve the speech reception
performance of cochlear implant subjects by modifying stimulator parameters interactively. Because
the surgical placement of electrodes and the number of surviving neurons vary from subject to
subject, it is important to be able to adjust an electrode stimulator for each subject's condition.

It  is  worth  noting  that  S04  often  chose  to  remain  in  the  clinic  for  hours  listening  to  music 
through  the  PISCES  system.  Thus,  CIS  shows  promise  of  improving  acoustic  reception  for  a  wide 
range  of  inputs. 

5.3.2  Experience with Subject S05

Table  3  shows  that  subject  S05  has  a  narrower  set  of  dynamic  range  measurements  than  S04, 
presumably  due  to  poorer  nerve  survival  and  electrode  placement.  The  differences  between  the 
ranges  of  S04  and  S05  vary  from  9  to  13  dB.  Since  the  dynamic  ranges  are  quite  small  compared  to 
a  normal  hearing  range  of  over  100  dB,  these  differences  would  be  expected  to  yield  qualitatively 
different  behavior.  As  a  baseline,  S05  scored  34%  on  the  24  consonant  test  using  his  Ineraid 
stimulator  (cf.  S04's  79%). 

S05's results using CIS are shown in Table 5. The CIS parameter set that produced the best


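TABLE 5

Subject S05 Performance on 16 VCV Test

The original table gives scores by Gain (dB) in rows and Input Dynamic Range (dB) in columns
(60, 40, 30, 20, 10, Linear). The column position of each score is not recoverable from the
scan, so the legible scores are listed by row in left-to-right order.

Gain (dB)   Legible scores
---------   ---------------------
-2          49.0%
0           46.0%, 57.0%‡
4           61.2%
10          55.0%, 56.4%, 54.8%
16          63.9%‡, 57.8%
24          (none legible)
0†          49.5%
16†         56.4%, 61.4%

Compare to Ineraid hardware baseline performance of 48.9%.‡
† Boost of 3, 6, 9, and 12 dB on channels 3-6, respectively.
‡ Condition for which two test runs were performed.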


scores for S04 (i.e., a system using a 60-dB input range, quadrature envelope extraction, sharp
filters, and six channels) yielded a disappointing 23% on the 24 VCV test. This result
demonstrates that a parameter set producing good performance in one subject may not be good
for another.

As there was no reason to assume that any of the CIS parameters that worked best for S04
would be a poor starting point for S05 (the only exception being the subject dependent
compression mappings for each channel), various sets of input compression ranges and gains were
explored. Careful input gain adjustment is required to maximize the speech activity within the
narrow compression range. Testing was reduced to only 16 consonants (a subset of the 24) to keep
S05 from tiring as a result of the difficult (for him) 24 consonant test. Table 5 shows the result of
the dynamic range exploration and again points out the importance of tuning the stimulator for
each subject's individual condition. It appears that S05 cannot process the wide dynamic range
inputs that S04 finds most usable and pleasant. In fact, when the input range available to each
of S05's electrodes was restricted, his performance increased significantly. Notice that the best
simulation system used for S05 provides an average score for two test runs of 63.9% correct versus
an average of two test runs for the Ineraid stimulator of 48.9% correct.

The  only  parameters  adjusted  for  S05  were  the  input  dynamic  range  of  the  compression 
curve  in  each  channel  output,  the  overall  gain,  and  the  per  channel  gain.  Even  this  simple  set  of 
parameter  adjustments  generates  a  large  space  of  possibilities  that  requires  many  hours  of  subject 
interaction.  If  this  space  were  explored  in  greater  detail,  the  results  might  still  be  a  strong  function 
of  other  stimulator  variables  (e.g.,  the  band-pass  filter  responses).  A  strategy  is  needed  that  allows 
convergence  to  an  overall  optimal  set  of  parameters  for  each  subject.  In  addition,  the  scores  for 
other  sets  of  speech  tests  must  be  examined  as  well. 

5.3.3  Brief  Experiences  with  Four  Subjects 

Four additional subjects were investigated briefly, each in a single session of approximately 3 h.
The results of these sessions as well as the best scores for S04 and S05 are shown in Table 6. Each of
these subjects was tested with the stimulator found to be optimum for subject S04. Regardless of
whether S04’s parameter set is a good starting point for the other four subjects, it became
clear that a single, short clinical interaction is not sufficient to tune a stimulator for a subject’s
particular needs.


TABLE 6

Best Performance for Six Subjects

Subject   Number of         Ineraid Hardware   Best PISCES      Comments
          Test Consonants   Score              Score

S04       24                79% (73-85)        99% (98-99)      50 h of testing
S05       16                49% (44-54)        64% (61-66)      10 h of testing
S02       24                29% (27-31)        33% (30-35)      3 h, chan 6 unused
S01       24                30% (one test)     11% (one test)   3 h, 5 & 6 unused
S16       8                 65% (54-71)        69% (58-75)      3 h
S23       16                38% (25-50)        36% (31-44)      3 h

Experience  with  Subject  S02:  Testing  S02  on  the  24  consonant  test  through  his  Ineraid 
stimulator  produced  an  average  score  of  29.1%  for  two  separate  test  runs.  After  computing  output 
compression  curves  based  on  dynamic  range  measurements  and  setting  current  isolator  gains,  the 
subject  observed  that  channel  6  caused  uncomfortable  sensations  of  feeling  rather  than  sound.  As 
a  result,  channel  6  was  not  included  in  the  CIS  system  for  this  subject. 

Two  separate  test  runs,  each  presenting  the  24  consonant  set  through  the  CIS  stimulator, 
resulted in an average score of 32.5%. This difference in performance relative to the Ineraid
stimulator was small. In fact, S02 preferred his own Ineraid stimulator. In future sessions, the plan
is  to  adjust  input  compression  ranges  as  done  for  S05  and  to  adjust  the  band-pass  filter  ranges  to 
compensate  for  the  lack  of  stimulation  from  channel  6.  Future  sessions  with  S02  might  also  focus 
on  optimizing  a  four-channel  CIS  simulation  before  trying  to  integrate  the  fifth  channel. 

Experience with Subject S01: S01 scored 30% on the 24 consonant test using his Ineraid
stimulator. When electrodes 5 and 6 (unused by his Ineraid hardware stimulator) were stimulated
to measure dynamic range, he had great difficulty making judgments of loudness and perceived
sensation. S01 has been deaf for many years, and he found the sensations on electrodes 5 and 6 so
unusual that he could not judge whether the current levels were reasonable. As a result, the gains
of these two channels were set at arbitrary levels for CIS stimulation. He scored 11% on the 24
VCV test.

S01 presents a distinct challenge in trying to extend stimulation to electrodes 5 and 6. As in
the case of S02, it might be worthwhile to start the next session by optimizing a four-channel CIS
simulation before moving to five or six stimulation channels.

Experience with Subject S16: For S16, it was possible to set up and use all six electrodes.
As data regarding threshold and UCL for each electrode were available only at a wider pulse width
of 100 µs per phase, the subpulse width was increased to 125 µs, and the pulse repetition rate was
lowered to 500 Hz. Testing with S16’s Ineraid resulted in a score of 65.3% averaged over three
test runs using an eight consonant subset. Because S16 complained that the CIS was driving his
electrodes too strongly, the peak currents were reduced to 0.75 of the original values. At these
settings, three test runs for the eight consonant test gave a score of 71%. Attenuating the current
gains by another factor of 0.75 and running two more tests gave an average of 66.5%. The overall
average for the five tests was 69.2%.
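
The overall figure can be checked by weighting each reported mean by its number of runs. The following is a minimal sketch, assuming that 71% and 66.5% are means over three and two runs respectively (the individual run scores are not given here); the function name `pooled_average` is illustrative only.

```python
def pooled_average(groups):
    """Average over all runs, given (mean_score, run_count) pairs."""
    total_runs = sum(count for _, count in groups)
    weighted_sum = sum(mean * count for mean, count in groups)
    return weighted_sum / total_runs

# (mean percent correct, number of runs) as reported for S16 with CIS
s16_cis_runs = [(71.0, 3), (66.5, 2)]
overall = pooled_average(s16_cis_runs)
print(f"{overall:.1f}%")  # 69.2%
```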

The  slight  increase  in  consonant  score  when  using  CIS  may  be  significant.  Again,  there  are 
many parameters to be adjusted for S16, including choosing the current isolator gains that best
use his dynamic range. The most encouraging result of this session was S16’s comment that he
is  “understanding  more  than  he  ever  has”  of  ordinary  conversation  with  the  CIS  simulation  in  its 
present  unoptimized  version. 


Experience  with  Subject  S23:  S23  scored  37.6%  when  using  the  Ineraid  stimulator  for  the  16 
consonant  test.  Tests  of  the  initial  CIS  stimulator  produced  a  score  of  36.2%  using  the  16  consonant 
subset.  S23’s  comment  that  “the  clarity  is  good”  for  CIS  compared  to  Ineraid  gives  us  optimism 
that  it  will  be  possible  to  converge  to  a  set  of  parameters  that  yields  substantial  increases  in  speech 
reception  compared  to  the  Ineraid. 

5.4  Comments  on  the  Clinical  Experience

Because clinical interaction using systems running real-time, digital speech algorithms connected
to cochlear implants was a new activity for both the Lincoln and MEEI staff, there was
a  steep  learning  curve  from  July  until  October.  By  October,  it  became  easier  to  run  automated 
speech  tests,  to  shift  parameters  easily,  and  to  set  reasonable  initial  CIS  parameters. 

None  of  the  subjects  described  have  used  the  CIS  systems  for  lengths  of  time  comparable  to 
the time they have used their Ineraid stimulators. In fact, except for S04 and S05, who underwent
some previous testing at RTI, none of the subjects had any CIS experience before
the single testing sessions reported. Speech reception with CIS processing would be expected to
improve  with  additional  use  and  with  continued  interactive  searching  for  better  parameter  sets. 
Several  subjects  commented  that  use  of  a  new  CIS  stimulator  in  place  of  the  Ineraid  hardware 
would  improve  their  day-to-day  speech  reception  performance. 
6.  CONCLUSIONS


This  report  has  summarized  an  Innovative  Research  Program  project  aimed  at  improving 
cochlear  implant  stimulators.  Drawing  upon  Lincoln  Laboratory  expertise  in  speech  coding,  digital 
signal-processing  theory  and  design,  and  facilities,  an  interactively  adjustable  implant  stimulator, 
PISCES,  was  designed,  built,  and  tested.  With  this  system  installed  at  MEEI/CIRL,  speech 
reception  of  several  implanted  subjects  was  improved.  Flexible  interaction  in  a  clinical  environment 
also  enables  new  designs  to  be  explored  and  tested. 

A need still exists for a wearable/portable stimulator capable of running the algorithms evaluated
on PISCES. Such a device should be easily reprogrammed both with respect to algorithm
and parameter set and would provide critical information about long-term learning effects for specific
stimulator algorithms. In addition, mechanisms for converging toward optimal parameter sets
for  a  given  subject  and  a  given  stimulator  algorithm  must  be  developed,  as  even  the  simplest  of 
algorithms  has  a  very  wide  space  of  adjustable  parameters.  The  National  Institutes  of  Health  will 
be  funding  a  three-year  program  at  MIT  involving  all  the  authors  for  extending  PISCES  in  these 
areas. 

Although this was not the first effort to provide an ability to modify cochlear implant stimulators
interactively in a clinical environment (both the RTI [13] and Melbourne [1] groups preceded
this effort), this effort is unique in employing a laboratory stimulator capable of running floating-point
algorithms, supporting a high-level programming language, and providing 16-bit analog I/O.
These  differences  may  become  more  important  over  the  next  few  years  as  CIRL  continues  to  utilize 
PISCES. 


APPENDIX  A 

The  Advantages  of  Quadrature  Envelope  Estimation 

The  envelope  of  an  analog  signal  is  often  estimated  by  applying  a  strong  nonlinearity  such 
as  a  full-wave  or  half-wave  rectifier.  The  output  of  the  rectifier  drives  a  low-pass  smoothing  filter, 
which  eliminates  spurious  harmonics  generated  by  the  process.  In  the  sampled  data  domain,  due  to 
aliasing,  the  spurious  harmonics  generated  by  the  full-wave  or  half-wave  rectifier  cannot  (generally) 
be  removed  by  low-pass  filtering.  As  a  consequence  of  this  concern,  an  envelope  estimator  has  been 
implemented  that  uses  a  pair  of  band-pass  filters  identical  in  frequency  response  but  90°  out  of 
phase  (“quadrature”  filters).  The  outputs  of  these  quadrature  filters  are  squared,  summed,  and 
square  rooted,  resulting  in  a  different  envelope  estimate.  The  following  discussion  describes  some 
of  the  differences  between  rectification-  and  quadrature-based  envelope  estimation. 

Consider a single sinusoidal input to a band-pass filter. The output of the band-pass filter is

$$ y(t) = A \cos(\omega t) . \tag{A.1} $$

Full-wave rectification yields a signal whose Fourier series expansion is

$$ |y(t)| = \frac{2A}{\pi} + \frac{4A}{3\pi} \cos 2\omega t - \frac{4A}{15\pi} \cos 4\omega t + \frac{4A}{35\pi} \cos 6\omega t - \cdots \tag{A.2} $$

This  operation  has  generated  a  series  of  even  harmonic  terms.  In  the  analog  domain,  these  terms 
would  be  eliminated  with  a  low-pass  smoothing  filter  operation,  and  the  result  would  be  a  term  at 
DC representing only the input amplitude (i.e., the envelope). In the sampled data domain, the
harmonics at four and six times the input frequency may alias into the bandwidth of the low-pass
filter. For example, a sinusoid at 2700 Hz
will  produce  a  sixth  harmonic  at  a  frequency  of  16,200  Hz  which  will  be  aliased  to  200  Hz  for  a 
system  sampling  rate  of  16  kHz.  From  the  series  expansion,  this  aliased  harmonic  would  be  about 
2/35  of  the  DC  term  or  about  6%.  Higher  input  frequencies  could  produce  aliasing  of  the  fourth 
harmonic  term  at  even  higher  levels. 
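
The aliasing arithmetic above can be reproduced with a short sketch. It assumes ideal sampling of a real-valued signal and uses the coefficients of (A.2), for which the term at 2n times the input frequency has magnitude 2/(4n^2 - 1) relative to the DC term. The function names are illustrative and are not part of PISCES.

```python
def aliased_frequency(freq_hz, sample_rate_hz):
    """Frequency in 0..fs/2 at which a real tone appears after sampling."""
    folded = freq_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

def rectifier_harmonic_ratio(n):
    """Magnitude of the cos(2n*wt) term of |A cos(wt)| relative to the DC term 2A/pi."""
    return 2.0 / (4 * n * n - 1)

fs = 16_000   # system sampling rate from the text, Hz
f_in = 2_700  # input sinusoid from the text, Hz

for n in (1, 2, 3):  # second, fourth, and sixth harmonics
    harmonic = 2 * n * f_in
    print(harmonic, aliased_frequency(harmonic, fs), round(rectifier_harmonic_ratio(n), 3))
```

For the sixth harmonic this gives 16,200 Hz folding to 200 Hz at a relative level of 2/35, in agreement with the figures quoted above.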

For  the  envelope  produced  by  the  quadrature  operations,  the  two  band-pass  filter  outputs  are 
Hilbert  transforms  of  each  other  (i.e.,  their  outputs  are  shifted  by  90°),  yielding 

$$ y_1(t) = A \cos(\omega t) \quad \text{and} \tag{A.3} $$

$$ y_2(t) = A \sin(\omega t) . \tag{A.4} $$

Next, the sum of squares signal is calculated as

$$ y_1^2 + y_2^2 = A^2 \cos^2 \omega t + A^2 \sin^2 \omega t = A^2 . \tag{A.5} $$

The squared term has no spurious frequencies, and the final square root operation produces the
constant envelope value A. Notice that a low-pass smoothing operation is not even required for
this simple case as no harmonics of the signal are generated.
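
A small numerical sketch of the single-sinusoid case follows. It assumes ideal quadrature filter outputs, modeled directly as a cosine and a sine rather than by running actual band-pass filters. The amplitude and sample count are illustrative values.

```python
import math

def quadrature_envelope(in_phase, quadrature):
    """Square, sum, and square-root a pair of quadrature filter outputs."""
    return [math.sqrt(i * i + q * q) for i, q in zip(in_phase, quadrature)]

fs = 16_000.0  # sampling rate, Hz
f0 = 2_700.0   # tone frequency, Hz
amp = 0.5      # illustrative amplitude A
n = 160        # number of samples

t = [k / fs for k in range(n)]
y1 = [amp * math.cos(2 * math.pi * f0 * tk) for tk in t]  # in-phase output
y2 = [amp * math.sin(2 * math.pi * f0 * tk) for tk in t]  # quadrature output

env = quadrature_envelope(y1, y2)
rect = [abs(v) for v in y1]  # full-wave rectifier output before smoothing

print(max(env) - min(env))    # essentially zero: the envelope is constant at A
print(max(rect) - min(rect))  # large ripple that a smoothing filter must remove
```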

For a more elaborate case of a two-sinusoid signal in the passband of a band-pass filter, the
output (ignoring phase offsets) is of the form

$$ y(t) = A \cos \omega t + B \cos(\omega + \Delta) t . \tag{A.6} $$

The output of a full-wave rectifier operation is difficult to quantify for even this simple two-sinusoid
case, but one can speculate that harmonics of the input frequencies as well as sum and difference
products will be produced, causing aliasing for the higher frequency inputs and the higher order
distortion terms.

Considering the quadrature processing of the two-sinusoid signal, the outputs would be

$$ y_1(t) = A \cos \omega t + B \cos(\omega + \Delta) t \quad \text{and} \tag{A.7} $$

$$ y_2(t) = A \sin \omega t + B \sin(\omega + \Delta) t . \tag{A.8} $$

The sum of the squares of the two band-pass filter outputs becomes

$$ y_1^2 + y_2^2 = A^2 + B^2 + 2AB \cos \Delta t . \tag{A.9} $$

Notice  that  this  signal  contains  only  constant  amplitude  terms  and  a  term  which  represents  the 
difference  between  the  two  sinusoids  (in  voiced  speech,  this  would  be  the  pitch  frequency).  These 
terms  represent  the  squared  envelope  term  with  no  spurious  frequencies  in  need  of  suppression  by 
a  low-pass  filter  or  susceptible  to  aliasing.  Unfortunately,  the  square  root  operation  required  to 
generate  the  envelope  amplitude  does  generate  spurious  harmonics.  However,  these  harmonics  are 
located  only  at  multiples  of  the  difference  frequencies  (the  pitch  harmonics),  not  the  original  input 
frequencies.  For  a  general  sum  of  sinusoids  as  an  input  to  the  pair  of  band-pass  filters  and  the 
Hilbert  envelope  process,  the  squared  output  will  only  contain  DC  terms  reflecting  the  energy  of 
each  sinusoid  and  sinusoidal  terms  at  each  of  the  possible  difference  frequencies.  The  process  of 
taking  the  square  root  will  generate  harmonics  of  these  difference  frequencies,  but  not  harmonics  of 
the  original  input  signals  so  that  there  is  a  lower  probability  of  aliasing  spurious  energy  back  into 
the  low-pass  filter  range.  Note  that  as  the  envelope  estimate  is  applied  to  a  nonlinear  compression 
curve  before  it  modulates  the  output  pulse  train,  even  a  perfect  envelope  signal  containing  pitch 
harmonics  would  generate  harmonics  of  the  pitch  signal. 
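
The two-sinusoid identity (A.9) can be confirmed numerically. In the sketch below the amplitudes, the 1000-Hz component, and the 100-Hz spacing are illustrative assumptions (two harmonics of a 100-Hz pitch within one band) and are not values taken from the report.

```python
import math

def sum_of_squares(a, b, omega, delta, t):
    """Left side of (A.9), built from the quadrature pair (A.7) and (A.8)."""
    y1 = a * math.cos(omega * t) + b * math.cos((omega + delta) * t)
    y2 = a * math.sin(omega * t) + b * math.sin((omega + delta) * t)
    return y1 * y1 + y2 * y2

def closed_form(a, b, delta, t):
    """Right side of (A.9): A^2 + B^2 + 2AB cos(delta * t)."""
    return a * a + b * b + 2 * a * b * math.cos(delta * t)

a, b = 1.0, 0.6                # illustrative amplitudes A and B
omega = 2 * math.pi * 1_000.0  # lower sinusoid, rad/s
delta = 2 * math.pi * 100.0    # difference (pitch) frequency, rad/s
fs = 16_000.0                  # sampling rate, Hz

worst = max(
    abs(sum_of_squares(a, b, omega, delta, k / fs) - closed_form(a, b, delta, k / fs))
    for k in range(1_600)
)
print(worst)  # rounding-level error only
```

The squared envelope therefore varies only at the difference frequency, independent of where the two sinusoids sit in the band.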

For  the  Hilbert  envelope  case  where  there  are  only  difference  frequencies  and  some  amount  of 
difference  frequency  harmonics  because  of  the  root  operation,  the  envelope  can  be  downsampled  for 
computational  savings.  If  the  rooting  distortion  is  ignored,  then  the  highest  difference  frequency  in 
any  band-pass  filter  Hilbert  envelope  output  will  be  the  difference  between  the  lowest  and  highest
frequency  sinusoids  that  can  be  passed  by  that  filter — a  difference  frequency  equal  to  the  passband 
width.  In  that  case,  the  envelope  waveform  can  be  sampled  at  twice  the  bandwidth  of  the  filter. 
If  the  output  of  a  band-pass  filter  was  a  conventional  amplitude  modulated  signal  with  the  full 
sidebands  fitting  into  the  filter  bandwidth,  then  the  modulating  signal  (the  envelope)  would  be  half 
the  bandwidth  of  the  filter.  In  this  case,  the  envelope  waveform  could  be  sampled  at  a  rate  equal 
to  the  bandwidth.  Since  one  cannot  model  the  band-pass  outputs  as  simple  amplitude  modulated 
signals,  one  must  be  guided  by  the  two  times  bandwidth  rule.  To  be  somewhat  conservative,  higher 
rates  have  been  employed  when  possible. 
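
The two-times-bandwidth rule translates directly into a maximum downsampling factor for each channel. The sketch below ignores the square-root distortion, as the text does; the band edges are hypothetical and do not represent the report's actual filter bands.

```python
import math

def min_envelope_rate(band_low_hz, band_high_hz):
    """Two-times-bandwidth rule: lowest envelope sampling rate for one channel."""
    return 2.0 * (band_high_hz - band_low_hz)

def max_decimation(sample_rate_hz, band_low_hz, band_high_hz):
    """Largest integer downsampling factor that still meets the rule."""
    rate = min_envelope_rate(band_low_hz, band_high_hz)
    return max(1, math.floor(sample_rate_hz / rate))

# Hypothetical 700-1100 Hz channel sampled at 16 kHz
print(min_envelope_rate(700.0, 1_100.0))         # 800.0
print(max_decimation(16_000.0, 700.0, 1_100.0))  # 20
```

A more conservative design would pick a smaller factor than this upper bound, in line with the higher rates mentioned above.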

For the case of outputs from rectification operations, there is no expectation that such signals
are band limited. Conservative design principles would dictate upsampling the band-pass filter
outputs before applying the rectification operations, thereby lowering the chances of aliasing harmonics
of the band-pass outputs back into the band of interest.